In an effort to develop a Standard
Reference Material (SRM) for Seebeck
coefficient,  we  have  conducted  a 
round-robin  measurement  survey  of  two 
candidate  materials — undoped  Bi2Te3  and 
Constantan  (55  %  Cu  and  45  %  Ni  alloy). 
Measurements  were  performed  in  two 
rounds  by  twelve  laboratories  involved  in 
active  thermoelectric  research  using  a 
number  of  different  commercial  and 
custom-built  measurement  systems  and 
techniques.  In  this  paper  we  report  the 
detailed  statistical  analyses  on  the 
interlaboratory  measurement  results  and 
the  statistical  methodology  for  analysis  of 
irregularly  sampled  measurement  curves  in 
the  interlaboratory  study  setting.  Based  on 
these  results,  we  have  selected  Bi2Te3  as 
the  prototype  standard  material.  Once 
available,  this  SRM  will  be  useful  for 
future  interlaboratory  data  comparison  and 
instrument  calibrations. 
Key words: round-robin; Seebeck coefficient; Standard
Reference Material; thermoelectric.


Accepted:  November  28,  2008 


Available  online:  http://www.nist.gov/jres 




Report Documentation Page (Standard Form 298)

Title: Statistical Analysis of a Round-Robin Measurement Survey of Two
Candidate Materials for a Seebeck Coefficient Standard Reference Material
Report date: November 2008
Performing organization: Naval Surface Warfare Center, West Bethesda, MD 20817
Distribution: Approved for public release; distribution unlimited
Security classification (report, abstract, this page): unclassified
Number of pages: 19


Volume  114,  Number  1,  January-February  2009 

Journal  of  Research  of  the  National  Institute  of  Standards  and  Technology 


1. Introduction

Thermoelectricity is the study of the direct conversion between thermal and
electrical energy through the Seebeck and Peltier effects. In the Seebeck
effect, a potential difference arises when a junction between two dissimilar
conductors is heated or cooled [1]. The Seebeck effect can be used for power
generation applications. Conversely, when a current passes through the
junction between two dissimilar conductors, heat is absorbed or expelled at
the junction depending on the direction of current flow. This is known as the
Peltier effect and can be used for electronic refrigeration [2].

The Seebeck coefficient (α) is defined as the voltage (V) generated per degree
of temperature difference between two points (α = ΔV/ΔT). The Seebeck effect
has been used by NASA to supply power for deep space probes in its
radioisotope thermoelectric generators (RTGs) and is of current interest to
automobile manufacturers to supply additional power through waste heat
recovery. RTGs have provided long-term reliability, with some deep space
probes approaching three decades of constant operation. The Peltier effect can
be used for spot cooling of computer processors and has been widely used to
thermally manage optoelectronic devices such as communication lasers and
infrared detectors. A more common use is in portable heaters/coolers that can
be purchased inexpensively at many local stores. While wider use of
thermoelectrics in more mainstream applications holds great promise because of
their high reliability and environmental friendliness, the low efficiency with
which they operate has restricted their usage. Recently, there has been a
resurgence of activity in this field to find novel materials that can operate
with higher efficiency to provide alternative power generation options and
competition with conventional refrigeration technology.

The efficiency of a thermoelectric material is directly related to the
thermoelectric figure of merit ZT, given by α²σT/κ, where σ is the electrical
conductivity, κ is the thermal conductivity, and T is the absolute
temperature. The current state-of-the-art thermoelectric materials from the
$(\mathrm{Bi}_{1-x}\mathrm{Sb}_{x})_{2}(\mathrm{Te}_{1-y}\mathrm{Se}_{y})_{3}$,
$\mathrm{Bi}_{1-x}\mathrm{Sb}_{x}$, $\mathrm{Si}_{1-x}\mathrm{Ge}_{x}$, and PbTe
systems all have maximum ZT values of around 1 at their respective optimum
temperatures. Although this value has been the maximum for over 40 years,
there exists no theoretical reason for this to be an absolute limit [3].
Several recent reports have indicated that much higher ZTs are possible both
in thin film superlattices [4] and in bulk materials [5]. A ZT of 3 to 4 would
indicate an efficiency great enough to allow direct competition with
conventional refrigeration devices [6]. While full evaluation of a material
requires measurement of the electrical resistivity or conductivity, Seebeck
coefficient, and thermal conductivity, measurement of just the Seebeck
coefficient can filter out those materials which do not have the desired
thermoelectric properties. There exists a minimum Seebeck coefficient that
must be achieved to give a desired ZT. If this Seebeck coefficient is not
achieved, the material does not warrant further study, as the other properties
cannot overcome a deficiency in the Seebeck coefficient. For ZT = 1, the
Seebeck coefficient must be > 157 μV/K; for ZT = 2, the Seebeck coefficient
must be > 222 μV/K. The derivation of this minimum Seebeck coefficient assumes
the ideal case in which the lattice thermal conductivity is zero. Because the
lattice thermal conductivity will not be zero in any real system, the actual
Seebeck coefficient must be somewhat higher [7].
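The quoted thresholds can be reproduced approximately with a short calculation. The sketch below rests on an assumption that the text does not state (it defers to Ref. [7]): with zero lattice thermal conductivity, the remaining electronic thermal conductivity is taken to obey the Wiedemann-Franz law, κ = L₀σT, so that ZT = α²/L₀. The Sommerfeld value L₀ = 2.44 × 10⁻⁸ W Ω/K² and the function name are illustrative choices. Under these assumptions the result falls within about 1 % of the 157 μV/K and 222 μV/K figures.

```python
import math

# Assumed Sommerfeld value of the Lorenz number, in W*Ohm/K^2 (not given in the text).
LORENZ_NUMBER = 2.44e-8


def min_seebeck_uV_per_K(zt_target, lorenz=LORENZ_NUMBER):
    """Smallest Seebeck coefficient (in uV/K) compatible with a target ZT.

    Assumes zero lattice thermal conductivity and the Wiedemann-Franz law,
    kappa = lorenz * sigma * T, so that ZT = alpha**2 / lorenz.
    """
    if zt_target <= 0:
        raise ValueError("zt_target must be positive")
    alpha_V_per_K = math.sqrt(zt_target * lorenz)
    return alpha_V_per_K * 1e6  # convert V/K to uV/K


if __name__ == "__main__":
    for zt in (1.0, 2.0):
        print(f"ZT = {zt:.0f}: alpha_min ~ {min_seebeck_uV_per_K(zt):.1f} uV/K")
```

Because the threshold scales as the square root of ZT, doubling the target ZT raises the required Seebeck coefficient by a factor of about 1.41, which matches the ratio of the two values quoted in the text.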

One of the needs that persists in this research field is that of a Seebeck
coefficient standard reference material (SRM) to help ensure reliable
measurements and characterization. Researchers building measurement equipment
need to be able to calibrate their systems to known values in order to ensure
consistency with different equipment in other laboratories. Numerous
laboratories perform thermoelectric materials characterization through
measurement of the electrical resistivity or conductivity, thermal
conductivity, and Seebeck coefficient. These required measurements are
demanding, especially the thermal conductivity measurements; however, one of
the most important initial measurements is that of the Seebeck coefficient
because of the minimum requirement described above. Standard reference
materials exist for thermal conductivity and electrical conductivity, and
there are reliable low Seebeck coefficient materials such as Pb or Pt;
however, there is no high Seebeck coefficient SRM [8].

1.1  National  Institute  of  Standards  and  Technology 
(NIST)  and  Thermoelectrics 

Research efforts at NIST are guided by the NIST mission and vision statements.
The NIST mission is “to promote U.S. innovation and industrial competitiveness
by advancing measurement science, standards, and technology in ways that
enhance economic security and improve quality of life.” The NIST vision is “to
be the global leader in measurement and enabling technology, delivering
outstanding value to the nation.”

With respect to the thermoelectric research community, the NIST mission and
vision can be applied in two areas. First, NIST can help develop the metrology
of thermoelectric measurements. A number of excellent thermoelectric
measurement techniques are currently in use by the research community.
However, these can be improved and new measurement techniques developed.
Second, NIST can provide guidance and objectivity in measurements. This can be
accomplished through development of standardized measurement procedures and
methodologies, objective testing of results, uncertainty assessment, and
development of standard reference materials.

The NIST Standard Reference Material (SRM) program currently offers over 1100
SRMs, which are used for a variety of purposes such as instrument calibration,
accuracy verification, and development of new measurement techniques. However,
the program has not previously looked at thermoelectric materials. As
mentioned previously, full characterization of a thermoelectric material
requires measurement of the Seebeck coefficient, electrical resistivity, and
thermal conductivity, usually as a function of temperature. SRMs are currently
available for the electrical resistivity and thermal conductivity. These are
SRM 8420/8421 (electrolytic iron) and SRM 8424/8426 (graphite). Except for the
electrical resistivity of graphite, the range of values covered by these SRMs
is not typical of thermoelectric materials and hence not appropriate for
calibration of measurement equipment used in the field. While these SRMs are
not ideal, they do at least exist. There is, however, no SRM for the Seebeck
coefficient. This is a void that needs to be filled, as such an SRM is much
needed by the thermoelectric research community.

1.2  Thermoelectric  SRM  Requirements 

A number of aspects had to be considered when developing the Seebeck SRM.
First, the material had to possess long-term stability. In addition, the
material should be homogeneous and able to be produced in a large consistent
batch, because of the time and cost that would be required to certify each
sample individually. A large homogeneous batch allows measurements of
representative samples to provide data indicative of the whole batch. Second,
the SRM had to be certified over a broad temperature range, as most
researchers in this field perform temperature-dependent measurements.
Measurements are usually divided into the low temperature regime (< 300 K) and
the high temperature regime (> 400 K). Thermoelectric research is active in
both temperature regimes, so SRMs are needed for both. While there is normally
some overlap between these regimes, they typically require different
measurement equipment. Because of this, we determined that this SRM would be
focused on one temperature regime. Third, it is important that the SRM possess
a Seebeck coefficient with a magnitude on the order of that typically measured
in the field. These values should be somewhere from 25 μV/K to 400 μV/K;
somewhere in the middle of this range would be ideal. Fourth, the SRM should
be available at a reasonable price to the community; therefore the development
and production must be cost-effective. Also, there should be sufficient demand
for the SRM, which in turn has an impact on the price. Fifth, as we consider
development of the SRM, some thought must be given to future SRMs. It might be
possible to use the same material for future thermoelectric-related SRMs if
chosen properly. Future SRMs could be produced over a broader or different
temperature range, for different properties or for ZT, or for other sample
geometries such as thin film.

2. Round-Robin Measurement Survey¹

We initiated a measurement survey to determine the feasibility of producing
the SRM, the consistency of the candidate materials, and the best measurement
technique for providing the standard data. Two candidate materials were
chosen. Constantan is well known as a simple alloy (55 % Cu/45 % Ni) commonly
used in thermocouples, with a moderate Seebeck coefficient at room
temperature. Cylindrical samples (6.47 mm long by 3.45 mm diameter) were
purchased from Concept Alloys. Bi2Te3 is a state-of-the-art thermoelectric
material with a high Seebeck coefficient at room temperature. Undoped samples
were obtained from Marlow Industries in a rectangular shape (6.08 mm long by
3.04 mm square).

Although standards are needed in both the low and high temperature regimes,
for this SRM we decided to focus on the low temperature range from 10 K to
390 K. This decision was made because of previous experimental experience in
this temperature regime and the availability of measurement equipment. While
this standard primarily provides data for the low temperature regime, it will
also provide some overlap with the low end of high temperature equipment until
a standard can be provided for those temperatures.

A number of laboratories were enlisted to participate in this survey. These
are a mixture of laboratories actively involved in thermoelectric research and
represent industry, university, and government laboratories, both domestic and
international. These participants and the primary researcher from each are
listed in Table 1.

¹ The purpose of identifying the equipment in this article is to specify the
experimental procedure. Such identification does not imply recommendation or
endorsement by the National Institute of Standards and Technology.


Table 1. Round-robin measurement survey participants

Primary Researcher         Laboratory
Neil Dilley                Quantum Design
Norbert Elsner             Hi-Z Technology
Tim Hogan                  Michigan State University
Qiang Li                   Brookhaven National Laboratory
Nathan Lowhorn             National Institute of Standards and Technology
George Nolas               University of South Florida
Haruhiko Obara             National Institute of Advanced Industrial Science and Technology (Japan)
Jeffrey Sharp              Marlow Industries
Terry Tritt                Clemson University
Rama Venkatasubramanian    RTI International
Rhonda Willigan            United Technologies
Jihui Yang                 General Motors

2.1  Measurement  Equipment 

A  number  of  measurement  systems  were  used  in  this 
study  including  both  commercial  and  custom-built 
systems.  The  measurements  were  carried  out  with 
several  different  measurement  techniques  (some  systems 
were  capable  of  multiple  techniques). 

2.1.1  Commercial  Systems 

The Quantum Design Physical Property Measurement System (PPMS) with Thermal
Transport Option (TTO) is a versatile system which can measure the Seebeck
coefficient from 2 K to 400 K in several different modes, each of which was
used in this study. Samples can be mounted in either a 2-probe or 4-probe
configuration, and measurements can be performed with a stable sample
temperature or a dynamic sample temperature (usually < 0.5 K/min). The dynamic
measurements continuously monitor the ΔT and ΔV along the sample while
supplying a heat pulse to one end and slowly varying the sample temperature.
This approach gives the ability to measure the Seebeck coefficient as a
function of temperature without having to wait for stability and data
collection at each temperature. The steady-state values for ΔT and ΔV are
found by extrapolating the data from a relatively short heat pulse. This
system prefers a sample geometry such that the thermal conductance at 300 K is
between 1 mW/K and 5 mW/K for 2-probe measurements. Bar- or disc-shaped,
gold-plated copper contact leads were used and attached to the sample with
either solder or silver epoxy (EpoTek H20E). The versatility of this system
also allows for integrating third-party electronics and/or software to perform
custom measurements. One laboratory provided data using this system with a
Keithley nanovoltmeter to measure the Seebeck voltage while performing a
direct steady-state DC measurement.

The ULVAC RIKO ZEM-2 system performs a steady-state sweep technique and
operates in two modes to cover different temperature regimes. The cryostat
mode allows measurements from 193 K to 373 K, while the furnace mode allows
measurements from room temperature to 1273 K. This system prefers samples
13 mm or longer, while at least 8 mm of length is recommended by the vendor.
Using samples shorter than this length introduces error due to smaller probe
spacing and temperature difference. The samples in this study were only 6 mm
long and required extenders to span the length not covered by the sample. A
4-probe measurement geometry was used with chromel or platinum lead wires
attached to the ends of the samples and Type K (Type M8 and L) or R (Type M10)
thermocouple probes attached to the sides. In this steady-state sweep
technique, the sample was held at a constant temperature while one end of the
sample was heated to produce a constant temperature gradient. The temperature
and voltage differences between the thermocouple probes were measured. The
next temperature difference value was then attained, and the measurements were
repeated. After all temperature difference setpoints at a particular sample
temperature were covered, the slope of the voltage difference (ΔV) vs
temperature difference (ΔT) gave the Seebeck coefficient at that sample
temperature. After this, the sample temperature was changed, and the
measurement was repeated.
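The slope step of the sweep technique can be sketched as an ordinary least-squares fit of ΔV against ΔT. This is only a minimal illustration, not the vendor's algorithm: the function name and the sample numbers are made up, and sign conventions and lead-wire corrections are ignored.

```python
def seebeck_from_sweep(delta_T, delta_V):
    """Least-squares slope of delta_V (uV) against delta_T (K), in uV/K.

    A constant voltage offset shifts the intercept but not the slope.
    """
    n = len(delta_T)
    if n != len(delta_V) or n < 2:
        raise ValueError("need at least two matched (delta_T, delta_V) pairs")
    mean_T = sum(delta_T) / n
    mean_V = sum(delta_V) / n
    s_tt = sum((t - mean_T) ** 2 for t in delta_T)
    if s_tt == 0:
        raise ValueError("delta_T values must not all be equal")
    s_tv = sum((t - mean_T) * (v - mean_V) for t, v in zip(delta_T, delta_V))
    return s_tv / s_tt


if __name__ == "__main__":
    # Made-up setpoints at one sample temperature: slope -40 uV/K, 3 uV offset.
    dT = [1.0, 2.0, 3.0, 4.0]
    dV = [-40.0 * t + 3.0 for t in dT]
    print(seebeck_from_sweep(dT, dV))
```

Using the slope over several ΔT setpoints rather than a single ΔV/ΔT ratio means that a constant offset voltage does not bias the result.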

2.1.2  Custom  Systems 

Three  laboratories  used  systems  which  allowed  for 
measurements  over  a  broad  temperature  range  covering 
much  of  the  target  range  for  this  study.  Each  of  these 
employed  different  measurement  techniques  and  sample 
mounting,  however. 

The first system used a steady-state sweep technique in which the sample was
held at a constant temperature and the ΔT was slowly ramped through a range of
values while monitoring the ΔV. The data were fit linearly, and the slope
yielded the Seebeck coefficient. A small resistor was epoxied to the top of
the sample, and the opposite end was soldered to a heat sink. Two differential
thermocouple contacts were made to the sides of the sample for measuring the
ΔT, and a thermocouple epoxied between the differential thermocouple contacts
measured the average sample temperature.

The second system used a 4-probe configuration in which current was pulsed
through a small platinum heater resistor on one end of the sample to generate
the ΔT. The other end of the sample was attached to the probe using solder or
silver paste. Silver paste was used to attach type-E thermocouples to the
sample to measure the ΔT.

The third system used a pseudo-steady-state technique in which a constant ΔT
was applied along the sample, and measurements of the ΔV were made as the
sample temperature was slowly changed (< 1 K/min). A smaller ΔT, calculated as
a percentage of the sample temperature, was used as the temperature was
decreased. Samples were soldered between two copper blocks which acted as
voltage probes for measuring the ΔV. The junctions of a differential
thermocouple were embedded in the copper blocks to measure the ΔT.

The other systems only measured at or near room temperature. Three of these
used a simple ΔT sweep technique but had slight sample mounting variations. In
the first technique, copper end caps were soldered to the ends of the sample,
and each cap included a copper wire and a 3 mil Type T thermocouple. One end
of the system was thermally sunk to a thermoelectric cooler to provide basic
sample temperature control. In the second technique, samples were mounted
between two copper blocks and partially exposed above the blocks. Voltage and
thermocouple probes were attached to the exposed parts. Cartridge heaters were
embedded in each block to control the ΔT. Two measurements were performed at
each temperature with reversed thermocouples to account for thermocouple
variations. The sample was slowly swept through a range of ΔT values centered
on the temperature being measured. In the third technique, samples were
clamped between two clean copper blocks, each embedded with a heater and
thermocouple. The blocks were held at different temperatures and ramped slowly
through different ΔT values while the ΔV was recorded. A linear fit to the
data gave the Seebeck coefficient.

One of the other systems used a basic single point measurement. Samples were
mounted between two nickel-plated copper blocks held at different temperatures
to produce a ΔT along the sample. The ΔV between the two blocks was measured
and divided by the ΔT to give the Seebeck coefficient.


The last system used a Harman technique in which a ΔT was produced along the
sample by means of the Peltier effect when a current was passed through the
sample. After stabilization, the current was switched off, and the ohmic and
Seebeck voltages were separated from the total voltage. Measurements were
repeated using the opposite current direction to account for thermocouple
differences and voltmeter offsets.

2.2  Round-Robin  Procedure 

The measurements were conducted in two rounds to allow each sample to be
measured by two different laboratories and to provide a good amount of
comparative data while working within the time constraints of the project and
the participants. Ideally, each sample would be measured by all laboratories.
However, due to the nature of these measurements, this would require an
extreme time commitment by each laboratory and would greatly lengthen the SRM
project as a whole, which was not practical. The procedure we used allowed
each measurement technique to be performed on two different samples and each
sample to be measured by two different laboratories. Also, multiple samples
were measured at NIST using one technique to provide additional sample
consistency data.

Two samples of each candidate material were sent to each laboratory. One
sample of each was to be measured while the other served as a backup. Some
laboratories provided data on both samples. Each laboratory was asked to
perform a minimum of two measurements on each sample, and more if necessary to
provide confidence in the final data. Also, each laboratory was asked to use
their normal techniques, and multiple techniques if available and if time
allowed.

The measured samples were then sent back to NIST, where they were randomly
assigned to a different laboratory for the second round of measurements. Other
switching arrangements were discussed and considered at length. We considered
hand selecting some of the switching to ensure certain comparisons would be
made between specific laboratories and their measurement techniques. In the
end, however, it was decided it would be better to allow switching to be fully
random so that the broadest number of comparisons would be possible. The
samples were then sent out to the laboratories again for the second round of
measurements.
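One way such a fully random reassignment could be drawn is sketched below. The paper does not describe the mechanism actually used, so this is purely illustrative: it assumes one sample per laboratory, and the function name and the rejection-sampling approach are choices made for this example. The only constraint enforced is the one stated above, that each sample goes to a different laboratory in the second round.

```python
import random


def reassign_samples(first_round_labs, rng=None):
    """Randomly permute laboratories so that none receives its own sample back.

    first_round_labs[i] is the laboratory that measured sample i in round one;
    the returned list gives the laboratory for sample i in round two.
    """
    rng = rng or random.Random()
    labs = list(first_round_labs)
    if len(set(labs)) != len(labs) or len(labs) < 2:
        raise ValueError("need at least two distinct laboratories")
    while True:  # rejection sampling: reshuffle until no lab keeps its sample
        second_round = labs[:]
        rng.shuffle(second_round)
        if all(a != b for a, b in zip(labs, second_round)):
            return second_round


if __name__ == "__main__":
    labs = ["Lab A", "Lab B", "Lab C", "Lab D"]  # hypothetical labels
    print(reassign_samples(labs, random.Random(2009)))
```

Rejection sampling keeps every admissible assignment equally likely, and on average only a few shuffles are needed regardless of the number of laboratories.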

3. Measurement Data and Parametric Representation

There are issues which present difficulty when analyzing and combining
measurement data curves from different measurements, laboratories, or
techniques. First, the data cover different temperature ranges with different
numbers of sampling points or data density. We assign numerical labels 1
through 10 to the 10 laboratories whose data are accepted, and we use decimal
points within each interval to represent the different datasets from a
particular laboratory. The temperature sampling points for all measurement
data are shown in Fig. 1 for Constantan and Fig. 2 for Bi2Te3. Each
color/numeric label represents all the data from a particular laboratory. It
is seen that the temperature range and density of each measurement data set
differ greatly between laboratories, and even within the same laboratory.
These variations cause difficulty when comparing and combining the different
measurements. We use a parametric model for the measurement curves in order to
interpolate and to analyze multiple curves.


Fig. 1. Density of temperature measurement data for material Constantan. The
y-axis represents the numerical labels assigned to the 9 out of 12
laboratories as shown in Table 1, and the decimal points represent different
datasets from the given laboratory. The temperature unit is Kelvin (K). The
same color and numeric label are used for all data from each particular
laboratory.

Fig. 2. Density of temperature measurement data for material Bi2Te3. The
y-axis represents the numerical labels assigned to the 10 out of 12
laboratories as shown in Table 1, and the decimal points represent different
datasets from the given laboratory. The temperature unit is Kelvin (K). The
same color and numeric label are used for all data from each particular
laboratory.


3.1  Parametric  Interpolating  Model 

In order to analyze the variability in the irregularly and sparsely sampled
measurement data, we first consider data representation through parametric
models via multiple regression analysis [9]. We suppose that each individual
measurement data set from one of the m laboratories consists of

$$y_{ij}(t_{ijk}) = f_{0}(t_{ijk}) + e_{ij}(t_{ijk}), \qquad (1)$$

where $y_{ij}(t_{ijk})$ denotes the measurements at temperature points
$t_{ijk}$ by the jth measurement set within the ith laboratory, and
$f_{0}(t_{ijk})$ is the common (true) curve evaluated at $t_{ijk}$. The
measurement errors (including interpolation, laboratory, and sample
variability, etc.) and the lack-of-fit error due to the use of a parametric
model are summarized by the residual error term $e_{ij}(t_{ijk})$, which is
assumed to have a normal distribution $N(0, \sigma_{ij}^{2})$, where
$\sigma_{ij}^{2}$ should include the parametric model error for the jth
measurement of the ith laboratory. We use a parametric model for
$f_{0}(t_{ijk})$. The purpose of the model is to adequately approximate the
data with a parametric form; there is no physical meaning associated with the
parameters. The benefit is to have a set of finite-dimensional parameters as a
proxy summary of individual measurement curves.

Applying (1), we identified a multiple linear regression model [10] which
seemed to fit the available measured data set very well (see also comments in
Sec. 6).

$$y_{ij}(t_{ijk}) = a_{ij0} + [\ldots\,\log(t_{ijk}+1)\,\ldots] + a_{ij3}\,\sin\!\left(\frac{[\ldots]}{700}\right) + a_{ij4}\,\cos\!\left(\frac{[\ldots]}{700}\right), \qquad (2)$$

where $y_{ij}(t_{ijk})$ is the measured Seebeck coefficient (μV/K) at
temperature $t_{ijk}$ (Kelvin). The vector $a_{ij} = (a_{ij0}, a_{ij1},
a_{ij2}, a_{ij3}, a_{ij4})^{T}$ represents the parameterization of the
measured curve.
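To make the role of the coefficient vector concrete, the sketch below builds one row of regressors per temperature for a five-term model of the kind described by (2). Parts of the printed equation are not legible here, so two pieces are assumptions and are marked as such in the code: a plain linear term in temperature, and π t/700 as the argument of the sine and cosine. The intercept, the log(t + 1) term, and the 700 in the denominator come from the text. All function names are hypothetical.

```python
import math


def design_row(t_kelvin):
    """Regressor values at one temperature for the ASSUMED five-term model."""
    if t_kelvin <= -1.0:
        raise ValueError("log(t + 1) requires t > -1")
    phase = math.pi * t_kelvin / 700.0  # ASSUMED argument of sin and cos
    return [
        1.0,                        # intercept, multiplies a_ij0
        t_kelvin,                   # ASSUMED linear term
        math.log(t_kelvin + 1.0),   # log term, legible in the source
        math.sin(phase),            # multiplies a_ij3
        math.cos(phase),            # multiplies a_ij4
    ]


def design_matrix(temperatures):
    """One row of regressors per sampled temperature."""
    return [design_row(t) for t in temperatures]


def evaluate_curve(coefficients, t_kelvin):
    """Model value sum_l a_l * x_l(t) for a coefficient vector of length 5."""
    row = design_row(t_kelvin)
    if len(coefficients) != len(row):
        raise ValueError("expected five coefficients")
    return sum(a * x for a, x in zip(coefficients, row))
```

Because the model is linear in the coefficients, curves sampled at different temperatures can all be summarized by five numbers and evaluated on any common temperature grid, which is what makes the irregularly sampled data sets comparable.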

3.2 Parameter Estimation

To estimate the parameters in Eq. (2) for each data set, the standard least
squares method used in our earlier work [10] can be improved upon, because the
least squares estimator is unstable when the measured temperature points are
few or limited to a small range. Let $X$ denote the $n_{ij} \times p$ design
matrix consisting of $p = 5$ columns defined by the regression terms in (2)
and rows which are evaluated at each sampling point. Let $Y$ denote the
Seebeck coefficient response vector. The least squares estimator is given by


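$$\hat{\beta} = (X^{T}X)^{-1}X^{T}Y; \qquad \hat{Y} = X\hat{\beta}. \qquad (3)$$

The problem with applying the standard least squares method to (2) is that
$X^{T}X$ is nearly singular when the sample size is small or the temperature
measurement range is narrow. As a consequence, the estimated parameters can be
highly variable and unstable, and the uncertainties associated with the
estimated parameters are extremely large. To alleviate the problem one can use
the Ridge regression method [11], which introduces a smoothing parameter $k$
to stabilize the inverse computation:

$$\hat{\beta}_{k} = (X^{T}X + kI)^{-1}X^{T}Y; \qquad \hat{Y}_{k} = X\hat{\beta}_{k}. \qquad (4)$$

If we denote the singular value decomposition of $X$ by $UDV^{T}$, then
$X^{T}X = VD^{2}V^{T}$,

$$\hat{\beta}_{k} = (VD^{2}V^{T} + kI)^{-1}VDU^{T}Y = V(D^{2} + kI)^{-1}DU^{T}Y = \sum_{j=1}^{p} v_{j}\,\frac{\delta_{j}}{\delta_{j}^{2} + k}\,(u_{j}^{T}Y), \qquad (5)$$

where $D = \mathrm{diag}\{\delta_{1}, \ldots, \delta_{p}\}$,
$U = (u_{1}, \ldots, u_{p})$, $V = (v_{1}, \ldots, v_{p})$, and

$$\hat{Y}_{k} = UD(D^{2} + kI)^{-1}DU^{T}Y = \sum_{j=1}^{p} u_{j}\,\frac{\delta_{j}^{2}}{\delta_{j}^{2} + k}\,(u_{j}^{T}Y). \qquad (6)$$

Also, if we denote $A(k) = UD(D^{2} + kI)^{-1}DU^{T}$, then
$\hat{Y}_{k} = A(k)Y$.

The criterion $Q(k)$ for the choice of $k$ is

$$Q(k) = \frac{\frac{1}{n}\,\|(I - A(k))Y\|^{2}}{\left[\frac{1}{n}\,\mathrm{tr}(I - A(k))\right]^{2}} = \frac{\frac{1}{n}\left[\|Y\|^{2} - \sum_{j=1}^{p} \frac{\delta_{j}^{2}(\delta_{j}^{2} + 2k)}{(\delta_{j}^{2} + k)^{2}}\,(u_{j}^{T}Y)^{2}\right]}{\left[1 - \frac{1}{n}\sum_{j=1}^{p} \frac{\delta_{j}^{2}}{\delta_{j}^{2} + k}\right]^{2}}. \qquad (7)$$

In practice, we find that the smallest $k$ among the feasible values is always
preferred. This indicates that our chosen estimators are close to those given
by using the generalized inverses. If we let $X^{+}$ denote the Moore-Penrose
inverse of a matrix $X$, then $X^{+} = (X^{T}X)^{+}X^{T}$, and $X^{+}$
satisfies the following conditions [13]:

$$X^{+}X,\; XX^{+} \text{ are symmetric} \qquad (8)$$

$$XX^{+}X = X, \qquad X^{+}XX^{+} = X^{+}. \qquad (9)$$

If $X = UDV^{T}$, then $X^{+} = VD^{+}U^{T}$, where $D^{+}$ is the transpose
of $D$ whose positive singular values are replaced by their reciprocals. When
$k \to 0$, the Ridge regression estimator in (4) converges to the
Moore-Penrose generalized inverse estimator given by

$$\hat{\beta}^{+} = X^{+}Y = (X^{T}X)^{+}X^{T}Y. \qquad (10)$$

The estimator is a least squares solution to the following problem: its norm
$\|\beta\|_{2}$ is minimized among all vectors $\beta$ for which

$$\|Y - X\beta\|_{2} \qquad (11)$$

is minimized. The corresponding fitted regression line is given by

$$\hat{Y}^{+} = X(X^{T}X)^{+}X^{T}Y = XX^{+}Y. \qquad (12)$$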
The  covariance  of  p^  is  given  by 


The  choice  of  k  requires  careful  considerations.  A 
large  k  reduces  the  variance  in  the  resulting  estimator 
while  incurring  potentially  large  bias.  We  try  to  select  k 
that  gives  a  stable  estimator  and  has  negligible  bias.  A 
formal  procedure  for  choosing  k  is  based  on  the 
Generalized  Cross-validation  criterion  [12]  by  mini¬ 
mizing  the  prediction  variance 


Cov{PP=o\X^xy  (13) 

where  we  assume  Cov(T)  =  Note  that  the  Ridge 
regression  estimator  may  be  biased.  A  useful  notion  is 
estimable  function  (or  linear  combination  of  para¬ 
meters)  for  which  there  exists  unbiased  estimate  based 
on  linear  combination  of  data.  This  is  the  essence  of  the 
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theory  of  the  Gauss-Markov  model  and  for  estimable 
functions  there  are  simplifying  expressions  for  uncer¬ 
tainty  analysis  [14]. 
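The Ridge estimator of Eqs. (4) to (6) and the Generalized Cross-validation score of Eq. (7) can both be computed from a single singular value decomposition. The sketch below is illustrative only: the function names `ridge_svd` and `gcv`, the grid of candidate $k$ values, and the synthetic eight-point data set are assumptions, and the logarithm in the design matrix is assumed to be natural.

```python
import numpy as np

def ridge_svd(X, Y, k):
    """Ridge coefficients and fitted values from the thin SVD X = U D V^T."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ Y                                  # projections u_i^T Y
    beta = Vt.T @ (d / (d**2 + k) * uty)           # Eq. (5)
    fitted = U @ (d**2 / (d**2 + k) * uty)         # Eq. (6), equal to A(k) Y
    return beta, fitted

def gcv(X, Y, k):
    """Generalized cross-validation score Q(k) of Eq. (7)."""
    n = X.shape[0]
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    shrink = d**2 / (d**2 + k)                     # nonzero eigenvalues of A(k)
    resid = Y - U @ (shrink * (U.T @ Y))           # (I - A(k)) Y
    return (resid @ resid / n) / (1.0 - shrink.sum() / n) ** 2

# Synthetic curve with few points over a narrow range (illustrative values).
rng = np.random.default_rng(0)
t = np.linspace(250.0, 350.0, 8)
w = 2.0 * np.pi * t / 700.0
X = np.column_stack([np.ones_like(t), np.log(t + 1.0), np.sqrt(t), np.sin(w), np.cos(w)])
Y = -40.0 + 0.05 * (t - 300.0) + rng.normal(0.0, 0.2, t.size)

# Score a hypothetical grid of candidate k values and keep the minimizer.
ks = 10.0 ** np.arange(-8.0, 1.0)
scores = np.array([gcv(X, Y, k) for k in ks])
k_best = ks[np.argmin(scores)]
beta_k, fitted_k = ridge_svd(X, Y, k_best)
```

Setting `k = 0` in `ridge_svd` reproduces the Moore-Penrose estimator of Eq. (10) whenever all singular values are positive, which is the limiting case described in the text.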

The adequacy and validity of the parametric model as an approximate representation of the measurement data curves can be checked via comparison to the nonparametric model results using locally weighted regression (LOWESS), which is available in S-plus² and other statistical software packages [15, 16].

If we accept that Eq. (2) provides an adequate representation of measurement data curves across different samples and laboratories (see Fig. 3 and Fig. 4), the question arises as to how much meaning can be attached to the parameters and how much of the measurement variability across samples or laboratories can be accounted for by the variability in the parameter estimates. Two measurement data curves may have representations with vastly different coefficients because of differences in measurement data range and because of instability from under-sampling and over-parameterization within the data range. The data range is likely the result of the different measurement equipment used. When the number of sampling points is small, or when the measured data points do not support the complexity of the presumed model, the Ridge regression approach is preferred over the standard least squares method. The lack of parameter identifiability, or parameter redundancy, is a well-known problem in nonlinear regression [17, 18] and can be caused by the intrinsic nature of parameterization in nonlinear representations. Because of this, our view is to use the parametric representation as an interpolation tool only; it appears that the fitted parameters do not have much use beyond this data summarization stage.

Fig. 3. Fitted measurement curves by laboratory on the Constantan material.

Fig. 4. Fitted measurement curves by laboratory on the Bi2Te3 material.

² S-plus is a trademark of Insightful Corporation. Mention of a software product in this paper is only to illustrate and to make explicit the statistical procedures used in our data analysis, and does not imply in any way the endorsement of NIST.

4. Meta Analysis: Combining Irregularly Sampled Curves

4.1  Consensus  Mean  Curve 

After we have summarized the irregularly sampled measurement data curves through a parametric model, all data among the samples and laboratories can be compared at the measured data points or through interpolation via the parametric fits. The first important issue is to define the consensus mean curve for a particular group of measurement curves. The naive approach is to use the mean of the fitted regression coefficients, which we call the “mean regression” approach, in which the regression coefficients from each measurement curve are weighted equally. This approach does not work well because of the vast variability in the parameter estimates. The second approach is to fit a single model to all data from that group, which we call the “all data regression” approach. We find that the “all data regression” approach consistently gives the most sensible results. This approach is equivalent to the weighted vector mean approach, in which the regression coefficient vectors are weighted according to the inverse of the least squares covariance matrices, Eq. (13) [19]. However, we caution readers that the regression coefficient vectors are too heterogeneous to be analyzed using standard statistical procedures for meta analysis, such as those mentioned in the comprehensive review by Becker and Wu [20]. The reasons are that, in addition to huge differences in measurement uncertainty in some measurement curves due to limited sampling points, there are significant differences in measurement data ranges, and there are substantial between-laboratory differences in the measured temperature points. All these make the resulting regression coefficients less comparable, and make direct analysis based on the fitted regression coefficients very difficult. We argue that the regression coefficients should be treated as a function of data range as well as sample size and estimation uncertainty. To avoid these complications, data sets with fewer than 5 data points in the focus range were not considered, since either the fitted models were completely unreliable or the data were considered unreliable by the contributing laboratory. This resulted in 55 data sets being used for Constantan and 114 data sets being used for Bi2Te3. Thus, when we compare and evaluate the variability of the measurement curves, we focus on the interpolated measurement curves based on the fitted regression functions and use interpolated values where there are no direct measurement data.
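The stated equivalence between the “all data regression” fit and an inverse-covariance-weighted mean of the per-curve coefficient vectors can be checked numerically. The sketch below is an illustration under assumptions rather than the analysis code of this study: the three sampling designs and the noise level are synthetic, a common noise variance is assumed for all curves (so the weight matrices reduce to $X_i^T X_i$), the logarithm is assumed to be natural, and the coefficients of Eq. (14) are used only to simulate data.

```python
import numpy as np

def basis(t):
    """Regression terms of Eq. (2) evaluated at temperatures t (kelvin)."""
    t = np.asarray(t, dtype=float)
    w = 2.0 * np.pi * t / 700.0
    return np.column_stack([np.ones_like(t), np.log(t + 1.0), np.sqrt(t), np.sin(w), np.cos(w)])

rng = np.random.default_rng(1)
# Three hypothetical data sets with different sampling designs.
designs = [np.linspace(20, 380, 30), np.linspace(60, 340, 20), np.linspace(40, 390, 25)]
a_sim = np.array([-0.09, 1.81, -2.79, 0.93, 1.39])
data = [(t, basis(t) @ a_sim + rng.normal(0.0, 0.3, t.size)) for t in designs]

# Per-curve least squares fits and their weight matrices X_i^T X_i.
fits = [np.linalg.lstsq(basis(t), y, rcond=None)[0] for t, y in data]
infos = [basis(t).T @ basis(t) for t, _ in data]

# "Mean regression": equal-weight average of the coefficient vectors.
a_mean = np.mean(fits, axis=0)

# "All data regression": a single fit to the pooled data.
t_all = np.concatenate([t for t, _ in data])
y_all = np.concatenate([y for _, y in data])
a_all = np.linalg.lstsq(basis(t_all), y_all, rcond=None)[0]

# Weighted vector mean: (sum_i W_i)^(-1) sum_i W_i a_i with W_i = X_i^T X_i.
a_weighted = np.linalg.solve(sum(infos), sum(W @ a for W, a in zip(infos, fits)))

# Compare the curves, not the raw coefficients, on a common grid.
grid = np.linspace(40.0, 380.0, 50)
gap_weighted = np.max(np.abs(basis(grid) @ (a_weighted - a_all)))
gap_mean = np.max(np.abs(basis(grid) @ (a_mean - a_all)))
```

The comparison is made on predicted curves because, as noted above, the individual coefficients are poorly determined and are not meaningful on their own.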

4.2 Smooth Variance Estimation and Confidence Intervals

Another problem associated with the statistical analysis of the round robin data is the development of a confidence band for the consensus mean curve $m(t)$. We find that the most sensible approach is to first compute the curves over the desired range using the coefficients of the parametric model fitted to each data set from each laboratory, and then compute the pointwise variance $v(t)$ as the mean of the squares of the deviations of each curve from the central curve $m(t)$. The pointwise estimated functional variance may be very rough, and it can be smoothed using LOWESS with a small bandwidth (e.g., we use f = 0.2, that is, 20 % of the data points in each local fit). To compute the confidence band, we simply use $m(t) \pm c\sqrt{v(t)}$ with $c = 2$, which gives the pointwise 95 % confidence intervals (if the uncertainty in the variance estimate can be ignored). There is an interesting interpretation of the pointwise confidence intervals: if one treats the two confidence bands as two boundary lines, and calls any measured or interpolated values on a curve lying outside the two bounds exceedance points, then the percentage of exceedances as a fraction of the total temperature points summed over all measurement curves tends to 5 %, so asymptotically the confidence intervals have the desired average spatial coverage probability of 95 %. A similar notion of confidence intervals is discussed by Wahba [21], who also coined the name Bayesian confidence intervals, and by Nychka [22], who proved that the pointwise confidence intervals in the context of smoothing spline regression have the required specified average coverage probability.
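The variance smoothing and the exceedance interpretation described above can be sketched as follows. This is a simplified stand-in, not the S-plus LOWESS routine used in the study: the `lowess` function here is a plain local linear fit with tricube weights and no robustness iterations, the names `lowess` and `confidence_band` are hypothetical, and the curves are synthetic with independent normal deviations.

```python
import numpy as np

def lowess(x, y, f=0.2):
    """Local linear smoother with tricube weights over the nearest f*n points.

    Simplified: no robustness iterations, evaluated only at the input x values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    r = max(3, int(np.ceil(f * n)))                 # points in each local fit
    smooth = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        h = np.sort(dist)[r - 1]                    # local bandwidth
        sw = np.sqrt(np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3)
        A = np.column_stack([np.ones(n), x - x[i]]) * sw[:, None]
        smooth[i] = np.linalg.lstsq(A, y * sw, rcond=None)[0][0]
    return smooth

def confidence_band(grid, curves, m, f=0.2, c=2.0):
    """Band m(t) +/- c*sqrt(v(t)), with v(t) the smoothed pointwise variance."""
    v = np.mean((curves - m) ** 2, axis=0)          # mean squared deviation from m(t)
    v_smooth = np.clip(lowess(grid, v, f), 0.0, None)
    half = c * np.sqrt(v_smooth)
    return m - half, m + half

# Synthetic example: 40 curves on a common grid of 75 temperatures.
rng = np.random.default_rng(2)
grid = np.linspace(20.0, 390.0, 75)
m = -40.0 - 0.02 * (grid - 300.0)                   # hypothetical consensus curve
curves = m + rng.normal(0.0, 0.5, size=(40, grid.size))

lower, upper = confidence_band(grid, curves, m)
# Fraction of all (curve, temperature) points falling outside the band.
exceed_fraction = np.mean((curves < lower) | (curves > upper))
```

The value of `exceed_fraction` can be compared with the nominal 5 %; for real fitted curves the deviations are correlated along temperature, so the agreement is only expected on average.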


5. Statistical Analysis Results


Using the “all data regression” approach and Eq. (2), we modeled all data for the two candidate materials, which gave

$$y(x) = -0.09 + 1.81\log(x+1) - 2.79\sqrt{x} + 0.93\sin\left(\frac{2\pi x}{700}\right) + 1.39\cos\left(\frac{2\pi x}{700}\right) \tag{14}$$

for Constantan and

$$y(x) = -55.10 - 4.79\log(x+1) - 2.49\sqrt{x} - 1.\text{[digits illegible]}\,\sin\left(\frac{2\pi x}{700}\right) + 57.61\cos\left(\frac{2\pi x}{700}\right) \tag{15}$$

for Bi2Te3. These results are plotted in Figs. 3 and 4, respectively, with all the data used for the model.
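As a usage illustration, the consensus curve of Eq. (14) can be evaluated directly at any temperature in the fitted range. The sketch below assumes that the logarithm is natural and that both trigonometric terms share the period of 700 K given in Eq. (2); the function name `constantan_consensus` is hypothetical. Eq. (15) is not coded because one of its coefficients is not legible in the source.

```python
import numpy as np

def constantan_consensus(x):
    """Consensus mean Seebeck coefficient (uV/K) for Constantan, Eq. (14); x in kelvin."""
    x = np.asarray(x, dtype=float)
    w = 2.0 * np.pi * x / 700.0
    return (-0.09 + 1.81 * np.log(x + 1.0) - 2.79 * np.sqrt(x)
            + 0.93 * np.sin(w) + 1.39 * np.cos(w))

temps = np.array([100.0, 200.0, 300.0])
values = constantan_consensus(temps)
```

Under these assumptions the curve evaluates to roughly -39 μV/K at 300 K.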


The variability among the measurement curves is quantified through the variance function, defined as the mean of the squares of the deviations of each curve from the central curve. The variance function can be very rough over some temperature ranges, and it is smoothed via the LOWESS smoothing function. The coefficient of variation (CV) at each temperature point is computed as the standard deviation divided by the absolute consensus mean value. The CVs as a function of temperature for both Constantan and Bi2Te3, with the variance function computed over all the measurement curves of samples, are plotted in Fig. 5. The standard deviation for the Constantan data increases with temperature, and the CV is nearly constant for temperatures above 100 K. For Bi2Te3, the standard deviation is nearly constant across temperature. The CV for Bi2Te3 is smaller than the CV for Constantan. Based on the results of our data analysis, the fact that Bi2Te3 has a larger absolute Seebeck coefficient value, and the fact that most laboratories have measurement techniques for Bi2Te3 over the wide range of temperature values that we are interested in, we have selected Bi2Te3 as our candidate Standard Reference Material (see Sec. 6 for more discussion). In addition, Bi2Te3 is currently one of the materials being used by industry for cooling applications.
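The role of the absolute signal size in the CV comparison can be made concrete with a small calculation. The numbers below are hypothetical and are not taken from the round-robin data: two materials are given the same spread of 2 μV/K about consensus values of -40 μV/K and -180 μV/K, and the LOWESS smoothing step is omitted for brevity. The function name `coefficient_of_variation` is an assumption.

```python
import numpy as np

def coefficient_of_variation(curves, m):
    """Pointwise CV: root-mean-square deviation from m(t) divided by |m(t)|."""
    sd = np.sqrt(np.mean((np.asarray(curves, dtype=float) - m) ** 2, axis=0))
    return sd / np.abs(m)

# Two hypothetical materials with the same spread but different signal size.
m_small = np.array([-40.0, -40.0])
m_large = np.array([-180.0, -180.0])
offsets = np.array([[2.0, 2.0], [-2.0, -2.0]])      # each curve deviates by 2 uV/K

cv_small = coefficient_of_variation(m_small + offsets, m_small)
cv_large = coefficient_of_variation(m_large + offsets, m_large)
```

For an equal standard deviation, the material with the larger absolute Seebeck coefficient has the smaller CV, which is one of the considerations cited above.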


Fig.  5.  Sample-to-sample  measurement  uncertainty  as  a  fraction  of  absolute  consensus  mean  signal  (“b”  for  Bi2Te3;  “c”  for  Constantan). 
From Fig. 6 through Fig. 11, we report the deviations from the consensus mean curve due to three factors (Sample, Laboratory, or Measurement Technique) that may affect measurement performance for each of the two materials, Constantan and Bi2Te3. The samples were assigned randomly in the first round and then switched to another laboratory in round two, so there are typically two or more samples measured by each laboratory. Each laboratory was asked to use its most reliable measurement technique, and some laboratories may have used up to four techniques for measurements. In this very exploratory experimental design we did not apply a rigorous statistical design involving orthogonality in order to separate the effect of measurement technique from that of the laboratory; therefore the effect of laboratory is strongly coupled with the techniques being used. The confounding effect with the choice of samples is less of an issue, since enough samples were measured and samples were usually measured twice by two different laboratories. The outlying measurements seen in Fig. 4 from a single laboratory (Lab 6) also show up in Figs. 7, 9, and 11. We believe this is caused by a single laboratory using measurement technique E, the reason being that some of the same samples were measured by another laboratory without producing the pronounced deviations. Overall, we consider our interlaboratory study to be successful in achieving good agreement in measurements from the volunteering participating laboratories and in identifying reliable measurement techniques over the desired wide temperature range that we are interested in pursuing.
Fig. 6. Sample bias (deviations from the consensus mean curve, with potential laboratory and technique differences) for all the samples used in the studies for Constantan.

Fig. 7. Sample bias (deviations from the consensus mean curve, with potential laboratory and technique differences) for all the samples used in the studies for Bi2Te3.

Fig. 8. Laboratory bias (deviations from the consensus mean curve, with potential sample and technique differences) for Constantan.

Fig. 9. Laboratory bias (deviations from the consensus mean curve, with potential sample and technique differences) for Bi2Te3.

Fig. 10. Measurement technique bias (deviations from the consensus mean curve, with potential laboratory and sample differences) used in the studies for Constantan.

Fig. 11. Measurement technique bias (deviations from the consensus mean curve, with potential laboratory and sample differences) used in the studies for Bi2Te3.


6. Summary

To summarize, our procedure for statistical analysis of irregularly sampled measurement curves in the interlaboratory study consisted of the following steps.

1) Each measurement data set is fitted to the parametric model, Eq. (2). The choice of tuning parameter in the Ridge regression parameter estimation and the goodness of fit are checked through the nonparametric LOWESS models. We arrive at a parametric representation of each measurement curve; at every temperature point within the measurement range, the Seebeck coefficient can be computed based on the fitted model.

2) For measurement performance comparison, whether it is sample-to-sample, laboratory-to-laboratory, or technique-to-technique, at a given common set of temperature values, we compute the predicted Seebeck values at the common temperature points based on the fitted parametric model, and then compute the standard deviation at each temperature point.

3) The common mean for multiple measurements is given by fitting the parametric model (2) to all the combined data.

4) The final confidence band is given by the common mean plus or minus the standard deviation multiplied by the coverage factor k = 2, which gives the 95 % average coverage probability assuming the normal distribution. The bias of each measurement is computed as the difference between the computed measurement point from the fitted parametric model and that from the common mean model.
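The four steps above can be strung together in a compact sketch. Everything data-related here is an assumption made for illustration: three hypothetical laboratories with offsets of -1, 0, and +1 μV/K, a fixed Ridge parameter of 1e-6 in place of a tuned one, a natural logarithm in the model, and the coefficients of Eq. (14) used only to simulate measurements. The LOWESS checks of step 1 and the variance smoothing are omitted.

```python
import numpy as np

def basis(t):
    """Regression terms of Eq. (2) (natural log assumed)."""
    t = np.asarray(t, dtype=float)
    w = 2.0 * np.pi * t / 700.0
    return np.column_stack([np.ones_like(t), np.log(t + 1.0), np.sqrt(t), np.sin(w), np.cos(w)])

def ridge_fit(t, y, k=1e-6):
    """Ridge estimate of the Eq. (2) coefficients; k is an illustrative value."""
    X = basis(t)
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

# Synthetic data for three hypothetical laboratories.
rng = np.random.default_rng(3)
a_sim = np.array([-0.09, 1.81, -2.79, 0.93, 1.39])
designs = [np.linspace(20, 380, 30), np.linspace(30, 390, 30), np.linspace(25, 385, 30)]
lab_offsets = [-1.0, 0.0, 1.0]
data = [(t, basis(t) @ a_sim + off + rng.normal(0.0, 0.1, t.size))
        for t, off in zip(designs, lab_offsets)]

coefs = [ridge_fit(t, y) for t, y in data]                 # step 1: fit each curve
grid = np.linspace(50.0, 350.0, 31)                        # common temperature points
pred = np.array([basis(grid) @ a for a in coefs])          # step 2: predicted values
common = ridge_fit(np.concatenate([t for t, _ in data]),
                   np.concatenate([y for _, y in data]))    # step 3: pooled fit
m = basis(grid) @ common                                   # common mean curve
sd = np.sqrt(np.mean((pred - m) ** 2, axis=0))             # step 2: pointwise spread
lower, upper = m - 2.0 * sd, m + 2.0 * sd                  # step 4: coverage factor 2
bias = pred - m                                            # step 4: bias of each curve
```

Each row of `bias` is a deviation curve of the kind plotted in Figs. 6 through 11; in this synthetic case the rows recover the simulated laboratory offsets.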
Our study offers a few lessons which may be beneficial for the future design and analysis of interlaboratory experiments involving sampled curves and functions. The significant differences (cf. Fig. 1 and Fig. 2) in the sampling design among different laboratories and different replicates have made analysis based on the parameters of an interpolating model unsuitable. We emphasize that the proposed model (2) is just one of the many interpolating models that can be used. For example, we have recently discovered another model in our latest Seebeck coefficient SRM work,

$$m(t) = a_0 + a_1 t + a_2 (t - 200)^2 + a_3 (t - 200)^3 + a_4 (t - 200)^4,$$

which also fits the round robin data well. However, we should point out that fitting this model to the round robin data still presents the same challenges, as the linear terms cannot be reformulated into orthogonal terms because of the vast differences in the sampling design of each data set, and orthogonality depends on the design of the data sets. The strong multicollinearity in the less densely sampled data sets makes the use of Ridge regression necessary, though it then becomes more difficult to compare the different data sets based on the fitted parameters. That is why we emphasize that the parametric model has served our purpose of interpolation within each data set very well, but the fitted parameters have no physical meaning and vary greatly across different data sets. Another important lesson is that we did not enforce a good statistical design that would have reduced the confounding of measurement technique and laboratory effects. In the future, when there are more laboratories that can use multiple techniques, a good choice of experimental design may become feasible.
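The dependence of multicollinearity on the sampling design can be illustrated with the condition number of the design matrix for the alternative polynomial model. The sketch below assumes that the powers of $(t - 200)$ in that model are 2, 3, and 4, and it uses two hypothetical sampling designs, one wide and dense and one narrow and sparse; the function names are assumptions.

```python
import numpy as np

def poly_design(t):
    """Columns 1, t, (t-200)^2, (t-200)^3, (t-200)^4 (powers assumed)."""
    t = np.asarray(t, dtype=float)
    s = t - 200.0
    return np.column_stack([np.ones_like(t), t, s**2, s**3, s**4])

def scaled_condition_number(X):
    """Condition number after scaling every column to unit Euclidean length."""
    return np.linalg.cond(X / np.linalg.norm(X, axis=0))

wide = np.linspace(10.0, 390.0, 40)      # hypothetical wide, dense design
narrow = np.linspace(290.0, 310.0, 6)    # hypothetical narrow, sparse design

c_wide = scaled_condition_number(poly_design(wide))
c_narrow = scaled_condition_number(poly_design(narrow))
```

A much larger condition number for the narrow design means that the same regression terms are nearly collinear there, which is the situation in which the Ridge estimator is needed and in which fitted coefficients are least comparable across data sets.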

Based on the results of the round-robin measurement survey, Bi2Te3 will be used for the SRM. To this end, 400 units have been purchased from Marlow Industries with sample dimensions of 8 mm × 3.5 mm × 2.5 mm. Based on feedback from the participants, these dimensions differ from those used for the round-robin measurement survey. The new dimensions allow more room for 4-probe resistivity measurements while maintaining an appropriate thermal conductance.

Bi2Te3 will be certified as the SRM at NIST with the standard data produced using a Quantum Design Physical Property Measurement System with some modifications, including third-party electronics and custom software. The details of this system and technique will be discussed elsewhere.